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Abstract 

This paper provides a theoretical analysis of diffraction-limited superresolution, demonstrating that 
arbitrarily close point sources can be resolved in ideal situations. Precisely, we assume that the incoming 
signal is a linear combination of M shifted copies of a known waveform with unknown shifts and ampli¬ 
tudes, and one only observes a finite collection of evaluations of this signal. We characterize properties 
of the base waveform such that the exact translations and amplitudes can be recovered from 2M -|- 1 
observations. This recovery is achieved by solving a a weighted version of basis pursuit over a continuous 
dictionary. Our methods combine classical polynomial interpolation techniques with contemporary tools 
from compressed sensing. 


1 Introduction 

Imaging below the diffraction limit remains one of the most practically important yet theoretically challenging 
problems in signal processing. Recent advances in superresolution imaging techniques have made substantial 
progress towards overcoming these limits in practice EZl 113, but theoretical analysis of these powerful 
methods remains elusive. Building on polynomial interpolation techniques and tools from compressed sensing, 
this paper provides a theoretical analysis of diffraction-limited superresolution, demonstrating that arbitrarily 
close point sources can be resolved in ideal situations. 

We assume that the measured signal takes the form 

M 

x{s) (1.1) 

i=l 

Here ipis, t) is a function that describes the image at spatial location s of a point source of light localized at 
t. The function ■0 is called the point spread function, and we assume its particular form is known beforehand. 
In ti,..., tM are the locations of the point sources and ci,..., cm > 0 are their intensities. Throughout 
we assume that these quantities together with the number of point sources M, are fixed but unknown. The 
primary goal of superresolution is to recover the locations and intensities from a set of noiseless observations 

{a;(s) I s G 5} . 

A mock-up of such a signal x is displayed in Figure]^ 

In this paper, building on the work of Candes and Fernadez-Granda mmaiii] and Tang et al HEa 
158] ■ we aim to show that we can recover the tuple (ti,Ci,M) by solving a convex optimization problem. 
We formulate the superresolution imaging problem as an infinite dimensional optimization over measures. 
Precisely, note that the observed signal can be rewritten as 

M 

xis) = = 

2=1 


ip{s,t)dpLi,{t). 


( 1 . 2 ) 
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Figure 1: An illustrative example of (1.11 with the Gaussian point spread function t) = . The ti 

are denoted by red dots, and the true intensities Ci are illustrated by vertical, dashed black lines. The super 
position resulting in the signal x is plotted in blue. The samples S would be observed at the tick marks on 
the horizontal axis. 


Here, /x* is the positive discrete measure where 6t denotes the Dirac measure centered at t. We 

aim to show that we can recover /x* by solving the following: 


minimize 




subject to x{s) 


ip{s, t)dfj,(t), s G S 


supp ^ C B 

/i > 0. 


(1.3) 


Here, H is a fixed compact set and w{t) is a weighting function that weights the measure at different locations. 
The optimization problem (1.3) is over the set of all positive finite measures /x supported on B. 

The optimization problem (1.21 is an analog of minimization over the continuous domain B. Indeed, if 
we know a priori that the are elements of a finite discrete set H, then optimizing over all measures subject 
to supp /X C n is precisely equivalent to weighted li minimization. This infinite dimensional analog with 
uniform weights has proven useful for compressed sensing over continuous domains m, resolving diffraction- 
limited images from low-pass signals [mug EH, system identification |49| , and many other applications HU. 
We will see below that the weighting function essentially ensures that all of the candidate locations are given 
equal influence in the optimization problem. 

Our main result. Theorem |1.2| establishes that for one-dimensional signals, under rather mild conditions. 


we can recover /x* from the optimal solution of (1.3). Our conditions, described in full-detail below essentially 
require the observation of at least 2M +1 samples, and that the set of translates of the point spread function 
forms a linearly independent set. In Theorem |1.3| we verify that these conditions are satisfied by the Gaussian 
point spread function for any M source locations with no minimum separation condition. 

Boyd, Schiebinger and Recht m show that the problem ( |1.3[ ) can be optimized to precision e in polyno¬ 
mial time using a greedy algorithm. In our experiments in Section we use this algorithm to demonstrate 
that our theory applies, and show that even in multiple dimensions with noise, we can recover very closely 
spaced point sources. 


1.1 Main Result 

We restrict our theoretical attention to the one-dimensional case, leaving the higher-dimensional cases to 
future work. Let x/) : —>■ K be our one dimensional point spread function, with the first argument denoting 

the position where we are observing the image of a point source located at the second argument. 

For convenience, we will assume that B = [—T, T] for some large scalar T. However, our proof will 
trivially extend to more restricted subsets of the real line. To define the weighting function in the objective 
of our optimization problem, fix a positive measure P on S and set 

w{t) = j ip{s,t)dP{s). 
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Here, for concreteness one can think of P as the uniform measure on the set S consisting of points at which 
we observe the signal. Just note that the particular form of P only has limited impact on the conditions of 
our main theorem. 

Before we proceed to state the main result, we need to introduce a bit of notation. We define the vector 
valued function u : M —)■ via 

u(s) = [V'(s,tl) ... ... ■ (1.4) 

Let us also define the vector valued function k : K —)■ via 

K{t) = J ip{t,s)v{s)dP{s). (1.5) 


and the matrix-valued function A : K^M+i ^ ^ 2 M+ix 2 M+i 


A(pi,...,_P2M+i) 


k(pi) 

1 


k{P 2 M+i) 

1 


( 1 . 6 ) 


Conditions 1.1. We impose the following four conditions on the point spread function ip: 


Integrability 

Positivity 

Independence 


'ip{-,t) e Lj,. 

For all t G [—T,T\ we have w{t) > 0. 

The matrix f v(s)v(s)^dP(s) is nonsingular. 


Determinantal There exists p > 0 such that for any t^^ ,tf G {ti — p,ti+ p), and t G [—T, T], 

the matrix A(tf ,t^,... ,t~f[,t~^,t^ is nonsingular whenever t,t~,t^ are distinct. 

Theorem 1.2. If ip satisfies the Conditions o and |iS| > 2M, then the true measure /r* is the unique 
optimal solution of (1.3). 


Note that the first three parts of Conditions 1 1.1 1 are both rather mild and rather easy to verify. Integra¬ 
bility simply requires the function to be integrable, and this is trivially true for a continuous point spread 
function and the uniform measure on a discrete set. Positivity eliminates the possibility that a candidate 
point spread function could equal zero at all locations—obviously we would not be able to recover the source 
in such a setting! Independence is satisfied if for any si,..., S2m G S, the matrix [u(si)... v{s2m)] is 
invertible. Such a condition is necessary if we want to recover the amplitudes uniquely assuming we knew 
the true C locations a priori. 

The fourth condition, Determinantal, looks impenetrable at first glance, and is indeed nontrivial to 
verify. As we will see below, this key condition is related to the existence of a canonical “dual-certificate” that 
is used ubiquitously in sparse approximation. In compressed sensing, this construction is due to Fuchs m, 
but its origins lie in the theory of polynomial interpolation developed by Markov and Tchebycheff, and 
extended by Gantmacher, Krein, Karlin and others (see the survey in Section 1.2). Importantly, we show it 
is satisfied by the Gaussian point spread function with no separation condition. 


Theorem 1.3. If |5| > 2M, then the Conditions 1.1 hold for ip{s,t) = e for any ti < ... < Im- 


Before turning to the proofs of these theorems (c.f. Sections 2.1 and 2.4), we survey the mathematical 
theory of superresolution imaging. 


1.2 Foundations: Tchebycheff Systems 

Our proofs rely on the machinery of Tchebycheff systems. This line of work originated in the 1884 
doctoral thesis of A. A. Markov on approximating the value of an integral f{x)dx from the moments 

ta xf{x)dx, ..., x‘^f{x)dx. His work formed the basis of the proof by Tchebycheff (who was Markov’s 
doctoral advisor) of the central limit theorem in 1887 |54| . 

^Tchebycheff is one among many transliterations from the cyrillic. Others include Chebyshev, Chebychev, and Cebysev. 
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Definition 1.4. A set of functions ui,...,u. 
ti < ... < tn, the matrix 

( Ui{ti) 


is called a Tchebycheff system (or T-system ) if for any 


\Uniti) 


litn)/ 


is invertible. 


An equivalent definition is that the functions form a T-system if and only if every linear 

combination U(t) = aiUift) -I- • • • -I- anUn(t) has at most n — 1 zeros. One natural example of a T-system 
is given by the functions Indeed, a polynomial of order n — 1 can have at most n — 1 zeros. 

Equivalently, the Vandermonde determinant does not vanish. 


1 


1 


t^‘ 


In 


t" 


7 ^ 0 , 


for any < ... < t„. Just as Vandermonde systmes are used to solve polynomial interpolation problems, T- 
systems allows the generalization of the tools from polynomial fitting to a broader class of nonlinear function¬ 
fitting problems. Indeed, given a T-system ui ,..., a generalized polynomial is a linear combination U{t) = 
aiUi{t) + - ■ ■+anUn{t). The machinery of T-systems provides a basis for understanding the properties of these 
generalized polynomials. For a survey of T-systems and their applications in statistics and approximation 
theory, see [5^1571155]. In particular, many of our proofs are adapted from |38| , and we call out the parallel 
theorems whenever this is the case. 


1.3 Prior art and related work 

Broadly speaking, superresolution techniques enhance the resolution of a sensing system, optical or otherwise; 
resolution is the distance at which distinct sources appear indistinguishable. The mathematical problem 
of localizing point sources from a blurred signal has applications in a wide array of empirical sciences: 
astronomers deconvolve images of stars to angular resolution beyond the Rayleigh limit [H], and biologists 
capture nanometer resolution images of fluorescent proteins iniisaiiziisE]- Detecting neural action potentials 
from extracellular electrode measurements is fundamental to experimental neuroscience |2fi| . and resolving 
the poles of a transfer function is fundamental to system identification |49| . To understand a radar signal, 
one must decompose it into reflections from different sources [33] ; and to understand an NMR spectrum, one 
must decompose it into signatures from different chemicals |52| . 

The mathematical analysis of point source recovery has a long history going back to the work of Prony |3D] 
who pioneered techniques for estimating sinusoidal frequencies. Modern analysis focuses on £i based methods, 
although this line of thought goes back at least to Caratheodory |15l flfij . It is tempting to apply the theory 
of compressed sensing [311131111111] to problem (]1.3[). If one assumes the point sources are located on a 


finite grid and are well separated, then some of the standard models for recovery are valid (e.g. incoherency, 
restricted isometry property, or restricted eigenvalue property). With this motivation, many authors solve 
the gridded form of the superresolution problem in practice [llllIllllllilMlSlEniEIllSH]. However, 
this approach has some significant drawbacks. The theoretical requirements imposed by the classical models 
of compressed sensing become more stringent as the grid becomes finer. Furthermore, making the grid finer 
can also lead to numerical instabilities and computational bottlenecks in practice. 

Despite recent successes in many empirical disciplines, the theory of superresolution imaging remains 
limited. Candes and Fernandes-Granada m recently made an important contribution to the mathematical 
analysis of superresolution, demonstrating that semi-infinite optimization could be used to solve the classical 
Prony problem. Their proof technique has formed the basis of several other analyses including that of 
Bendory et al [7] and that of Tang et al [52] . To better compare with our approach, we briefly describe 
the approach of [Tjuniia here. The classical compressed sensing arguments guarantee recovery of a sparse 
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signal by constructing a dual certificate Ha El. In the continuous setting of superresolution, this amounts 
to constructing a dual polynomial: a function of the form Q{t) = f ip{s,t)q{s)dP{s) satisfying 


\Q{t)\ < 1 

\Q{t,)\ = l, i = l,...,M. 


They construct g as a linear combination of ip^s^ti) and -J^ip{s,ti). In particular, they define the coefficients 
of this linear combination as the least squares solution to the system of equations 


Q{ti) = sign(ci), 


d 

dt 


Q{t) 


= 0 , 

t=ti 


i = 1,... ,M 
j = 1,..., M. 


(1.7) 


They prove that, under a minimum separation condition on the ti, the system has a unique solution because 
the matrix for the system is a perturbation of the identity, hence invertible. 

Much of the mathematical analysis on superresolution has relied heavily on the assumption that the point 
sources are separated by more than some minimum amount Oizillllllliaill]. We note that in practical 
situations with noisy observations, some form of minimum separation may be necessary. One can expect, 
however, that the required minimum separation should go to zero as the noise level decreases: a property 
that is not manifest in previous results. Our approach, by contrast, does away with the minimum separation 
condition by observing that this matrix need not be close to the identity to be invertible. Instead, we impose 
Conditions o to guarantee invertibility directly. Not surprisingly, we use techniques from T-systems to 
construct an analog of the polynomial Q in for our specific problem. 

Another key difference is that we consider the weighed objective f w{t)dp,{t), while prior work (TJHUIH] 
has analyzed the unweighted objective J dfj,(t). We, too, could not remove the separation condition without 
reweighing by w{t). In Section]^ we provide evidence that this mathematically motivated reweighing step 
actually improves performance in practice. Weighting has proven to be a powerful tool in compressed sensing, 
and many works have shown that weighting an fi-like cost function can yield improved performance over 
standard £i minimization pm 15^11551157]. To our knowledge, the closest analogy to our use of weights comes 
from Rauhut and Ward, who use weights to balance the influence of dynamic range of bases in polynomial 
interpolation problems m- In the setting of this paper, weights will serve to lessen the influence of sources 
that have low overlap with the observed samples. 

We are not the first to bring the theory of Tchebycheff systems to bear on the problem of recovering 
finitely supported measures. De Castro and Gamboa m prove that a finitely supported positive measure p. 
can be recovered exactly from measurements of the form 



whenever {mq) • ■ • ,Un} form a T-system containing the constant function uq = 1. These measurements are 
almost identical to ours; if we set Uk(t) = t) for k — 1,... ,n, where {si,..., s„} = 5 is our measurement 
set, then our measurements are of the form 


{a;(s) I s e 5} = / uidp, 


idfiY 


However, in practice it is often impossible to directly measure the mass f uod/i = f dp, as required by (1.3l. 
Moreover, the requirement that 1, tp(si,t ),..., ip(s„, t) form a T-system does not hold for the Gaussian point 
spread function 4’{s,t) = . Therefore the theory of [T^ is not readily applicable to superresolution 

imaging. 

We conclude our review of the literature by discussing some prior literature on superresolution without 
a minimum separation condition. The theory of signals with finite rate of innovation shows that given a 
superposition of pulses of the form ^ akip{t — tk), one can reconstruct the shifts tk and coefficients Ofc from 
a minimal number of samples ESI [55]. This holds without any separation condition on the tk and as 
long as the base function can reproduce polynomials of a certain degree see |28l Section A.l] for more 
details. The algorithm used for this reconstruction is however based on polynomial rooting techniques that 
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do not easily extend to higher dimensions. In contrast we study sparse recovery techniques which may 
be more stable to noise and trivially generalize to higher dimensions (although our analysis currently does 
not). Turning to recovery techniques, we would like to mention the work of Fuchs [3T] in the case that 
the point spread function is band-limited and the samples are on a regularly-spaced grid. This result also 
does not require a minimum separation condition. However, our results hold for considerably more general 
point spread functions and sampling patterns. Finally, in a recent paper Bendory presents an analysis 
of minimization in a discrete setup by imposing a Rayleigh regularity condition which, in the absence 
of noise, requires no minimum separation. Our results are of a different flavor, as our setup is continuous. 
Furthermore we require linear sample complexity while the theory of Bendory | 6 ] requires infinitely many 
samples. 


2 Proofs 

In this section we prove Theorem |1.2| and Theorem |1.3| We start by giving a short list of notation to be 
used throughout the proofs. 


Notation Glossary 

• We denote the inner product of /, 5 S Lp by {f,g)p ■= f f{t)g{t)dP{t). 

• For any differentiable function f : —>■ R, we denote the derivative in its first argument by dif and 

in its second argument by 82 /■ 

• For t gR, let ipti’) = 

• Finally, let Kp{t,T) = {ipt^ipr)p = f tpis,t)i>is,T)dP{s). 


2.1 Proof of Theorem 11.21 


We prove Theorem 1.2 in two steps. Firstly, we reduce the proof to 
{q,ijjt)p possesses some specific properties. 


constructing a function q such that 


Proposition 2.1. If the first three items of Conditions o hold, and if there exists a function q such that 
Q{t) ■— {q,'<ft)p satisfies 


Q{t,)=w{t,), j = (2.1) 

Q{t) < wftj), for t G [—T, T] and t tj, 


then the true measure /r* 


is the unique optimal solution of the program 


1.3 


This proof technique is somewhat standard [131 [31]: the function Q{t) is called a dual certificate of 
optimality. However, introducing the function w{t) is a novel aspect of our proof. The majority of arguments 
have w{t) = 1 . Note that when J ilj{s,t)dP{s) is independent of t, then w{t) is a constant and we recover 
the usual method of proof. 

In the second step we construct q{s) as a linear combination of the tpcentered point spread functions 
ijj(s,ti) and their derivatives d 2 ijj{s,ti). 


Theorem 2.2. Under the Conditions \l . 1\ there exist ai,..., aM, fii, ■ ■ ■ , Pm, c G R such that Q(t) = (g, fit)p 
satisfies ( 2 . 11 , where 

M 


='^{a^fi{s,ti)+Pi—fi{s,ti)) + c. 


To complete the proof of Theore m |1.2[ it remains to prove Proposition |2.1| and Theorem |2.2| Their proofs 
can be found in Sections 2.2 and |2 .3 j respectively. 
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2.2 Proof of Proposition |2.1| 


We show that /j,* is the optimal solution of problem (1.31 through strong duality. The dual of problem (1.31 
is 

maximize, {q,x)p 


subject to < w{t) for t G [—T, T]. 


( 2 . 2 ) 


Since the primal (1.31 is equality constrained, Slater’s condition naturally holds, implying strong duality. As 
a consequence, we have 

{q,x)p = J w{t)d^{t) q is dual optimal and /r is primal optimal. 


Suppose q satisfies (2.11. Hence q is dual feasible and we have 

M M 

{<l,x)p = p = ^CjQ{t^) 

i=i i=i 

= j w{t)d^ii,[t). 

Therefore, q is dual optimal and /x* is primal optimal. 

Next we show uniqueness. Suppose the primal (1.31 has another optimal solution 

M 

1=1 

such that {ti,. .. ^ {ti,... := T. Then we have 


{q,x)p = 


— CjQiij) + CjQ{ij) 
ij^'T ij^T' 


< 


Cjw(ij) + Y Cjw{tj) = J w{t)dfi{t). 


tjGT 


ij 


Therefore, all optimal solutions must be supported on {ti ,..., tM}- 

We now show that the coefficients of any optimal fi are uniquely determined. By condition Independence 
the matrix J v(s)v(s)^dP(s) is invertible. Since it is also positive semidefinite, then it is positive definite, 
so, in particular its upper M x M block is also positive definite. 


det 


I 






ip{s,tM)] dP{s) ^ 0. 


Hence there must be si,..., sm G S such that the matrix with entries '0(si, t-j) is nonsingular. 
Now consider some optimal /i = ^i^i- Since fl is feasible we have 

M M 

i=l i=l 


Since is invertible, we conclude that the coefficients Ci,... ,cm are unique. Hence /x*. is the unique 

optimal solution of (1.3l. 
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Figure 2: The point a is a nodal zero of /, and the point 6 is a non-nodal zero of /. 


2.3 Proof of Theorem 12.21 

We construct Q{t) via a limiting interpolation argument due to Krein [ID]. We have adapted some of our 
proofs (with nontrivial modifications) from the aforementioned text by Karlin and Stridden |38| . We give 
reference to the specific places where we borrow from classical arguments. 

In the sequel, we make frequent use of the following elementary manipulation of determinants: 

Lemma 2.3. If vq, ... ,Vn are vectors in K", and n is even, then 


Vi - Uo 


Vn - Vo 


Vl 

1 


Vn Vo 

1 1 


Proof. By elementary manipulations, both determinants in the statement of the lemma are equal to 


Vi-Vo ... Vn-Vo Vo 

0 ... 0 1 


□ 


In what follows, we consider e > 0 such that 


ti — e<ti+e<t2 — €<^2+ tM — e < Im + £■ 


Definition 2.4. A point t is a nodal zero of a continuous function / : K —> K if f{t) = 0 and f changes sign 
at t. A point t is a non-nodal zero if f{t) = 0 but f does not change sign at t. This distinction is illustrated 
in Figure^ 


Our proof of Theorem 2.2 proceeds as follows. With e fixed, we construct a function 


M 

QM = Y, 4^Kp{t, U) + /3Wa2Kp(t, tf) 

i=l 


such that (5e(t) = w{t) only at the points t = tj ± e for all j = 1, 2 ,..., M and the points tj ± e are nodal 
zeros of Qeit) — w{t) for all j = 1,2,..., M. We then consider the limiting function Q{t) = lim Qe{t), and 

prove that either Q{t) satisfies (2.11 or 2w{t) — Q{t) satisfies (2.1). An illustration of this construction is 
pictured in Figure 

We begin with the construction of Qe- We aim to find the coefficients a^, to satisfy 
Qe{ti — e) = w{ti — e) and Q,:{ti + e) = w{ti + e) for i = l,...,M. 

This system of equations is equivalent to the system 


Qe{ti - e) = w{ti - e) for i = l,...,M 
Qeiti + e) - Qe(ti - e) w(ti -I- e) - w(ti - e) 


2 e 


2 e 


for i = 1,..., M. 


(2.3) 















Figure 3: The relationship between the functions w{t), Qe{t) and Q{t). The function Qe{t) touches w{t) 
only at ti ± e, and these are nodal zeros of Qe{t) — w{t). The function Q{t) touches w{t) only at ti and these 
are non-nodal zeros of Q{t) — w{t). 


Note that this is a linear system of equations in with coefficient matrix given by 


Kp{tj - e,U) 

d 2 Kp{tj - e,ti) 

^{Kp{tj + e,ti) - Kp{tj - e,ti)) 

^{d 2 Kp{tj + e,U) - d 2 Kp{tj - e,U)) 


That is, the equations (2.31 can be written as 


■ 1 ■ 


w{ti - e) 

1 


w{tM - e) 

1 


+ e) - w{ti - e)) 

Pe 



. 1 . 


+ e) - w{tM - e))_ 


We first show that the matrix Kj is invertible for all e sufficiently small. Note that as e —)■ 0 the matrix 
Kj converges to 


Kp{tj,t,) 

d2Kp{tj,t,) 


diKp{tj,ti) 

did2Kp{tj,ti) 



= J v{s)v{s)'^dP{s), 


which is positive definite by Independence. Since the entries of converge to the entries of K, there is 
a A > 0 such that is invertible for all e G (0, A). Moreover, converges to as e —>■ 0 and for all 
e < A, the coefficients are uniquely defined as 


■ 1 ■ 


w{ti - e) 

Cte 

1 

1 


w(tM - e) 

+ e) - w{ti - e)) 

Pe 



1 


_^(w(tM + e) - - e))_ 


We denote the corresponding function by 


M 

Q,{t) + (3^^d2Kp{t,U). 

2 = 1 
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Before we construct Q{t), we take a moment to establish the following remarkable consequences of the De- 
TERMINANTAL condition. For all e > 0 sufficiently small the following hold: 

(a). Qe{t) = w{t) only at the points ti — e,ti + e,... ,tM — £,tM + £■ 


(b). These points ti — e,ti + e,... ,tM — e,tM + ^ are nodal zeros of Qe(t) — w(t). 

We adapted the proofs of (a) and (b) (with nontrivial modification) from the proofs of Theorem 1.6.1 and 
Theorem 1.6.2 of |38| . 

Proof of (a). Suppose for the sake of contradiction that there is a r S [—T, T] such that Qeir) = w(t) 
and T ^ {ti — e,ti + e,... ,tM — e,tM + e}- Then we have the system of 2M linear equations 


Qejij - e) 
w(tj - e) 
Qe{tj + e) 
w{tj + e) 


Qejr) 

w{t) 

Qeir) 

w(t) 


= 0 j = l,...,M 
= 0 j = l,...,M. 


Rewriting this in matrix form, the coefficient vector /Ij] = 
satisfies 


Ji] 




JM] 


[a^ Pe] - e) - k(t) K(ti + e) - k{t) ... K(tM + e) - ^(r)^ = [O ... O] . 


of Qe 


(2.5) 


By Lemma 2.3 applied to the 2M + 1 vectors Vi = k(<i — e),..., V 2 m = ^(^m + e)) and vq = k{t), the matrix 
for the system of equations (12.51) is nonsingular if and only if the following matrix is nonsingular: 


- e) 

1 


K{tM + e) k(.t) 

1 1 


= A(ti - + e,r). 


However, this is nonsingular by the Determinantal condition. This gives us the contradiction that 
completes the proof of part (a). 

Proof of (6). Suppose for the sake of contradiction that Qe(t) — w(t) has Ni < 2M nodal zeros and 
Nq = 2M — Ni non-nodal zeros. Denote the nodal zeros by {ti, ..., tjvj}, and denote the non-nodal zeros by 
zi,..., ZNg. In what follows, we obtain a contradiction by doubling the non-nodal zeros of Qe{t) — w(t). We 
do this by constructing a certain generalized polynomial u(t) and adding a small multiple of it to Q^{t) — w(t). 

We divide the non-nodal zeros into groups according to whether Qe(t) — w{t) is positive or negative in a 
small neighborhood around the zero; define 

X~ '■= {i \ Qe ^ w near Zi} and 1“*" '■= {i \ Qe > w near Zi}. 


We first show that there are coefficients gq, , gm, and 61 ,..., 6 m such that the polynomial 

M M 

u{t) = ^ GiKp{t, ti) + ^ hid 2 Kp{t, ti) + aow{t) 

i=l i=l 


satisfies the system of equations 

u{zj) = - 1-1 j G I~ 

= “1 i G 2 :+ 

u{Ti) = 0 i = 1,... ,Ni 

u{t) = 0, 


where t is some arbitrary additional point. The matrix for this system is 


/ «(zi)^ 1\ 


W 


k{zNo)^ 1 
«:(ti)^ 1 


k{tni)'^ 1 

V k{t) 1/ 


( 2 . 6 ) 
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w{t) 


Figure 4: The points {ti, r 2 , T3, T4} are nodal zeros of Qe(t) — w{t), and the points {(^ 1 , ^ 2 , Cs} ^re non-nodal 
zeros. The function u{t) has the appropriate sign so that Qe{t) — w{t) + 5u{t) retains nodal zeros at and 
obtains two zeros in the vicinity of each Q. 


where W = (\\&g{w{zi),... ,w[zno),w{ti), ... This matrix is invertible by Determinan- 

TAL since the nodal and non-nodal zeros of Qt(t) — w{t) are given by ti — e,... ,tM + e. Hence there is a 
solution to the system (2.6). 

Now consider the function 


M M 

U\t) = Q^{t) +Su{t) = + Sai]Kp{t,ti) + Sbi]d2Kp{t,t,) + Saow{t) 

where (5 > 0. By construction, u{Ti) = 0, so U^{t) — w{t) has nodal zeros at ti, ... ,tn^. We can choose <5 
small enough so that U^{t) — w{t) vanishes twice in the vicinity of each z^. This means that U^{t) — w{t) 
has 2M + Nq zeros. Assuming Nq > 0, select a subset of these zeros pi < .. . < P2M+1 such that there are 
two in each interval [ti — p^ti + p]. This is possible if e < p and S is sufficiently small. We have the system 
of 2M -I- 1 equations 


M 


M 


-I- 6ai]Kp{pi,U) + -|- Sbi\d2Kp{pi,U) 


(1 - Sao)w{T) 


M 

+ dai\Kp{p2M+i, 

2=1 


M 

U) + Y^[I3^^ + 6b,]d2Kp{p 
2 = 1 


(1 - Sao)w{T). 


Subtracting the last equation from each of the first 2M equations, we find that 

(aW -h (5ai,... -f (55 m) («:(pi) - k(p2M+i) • ■ ■ k(P2m) - k(P2M+i)) = (0,... ,0). 


This matrix is nonsingular by Lemma [2l3| combined with the Determinantal condition. This contradiction 
implies that Nq = 0. This completes the proof of (b). 

We now complete the proof by constructing Q(t) from Qe{t) by sending e —>■ 0. Note that the coefficients 
Q;e,/5e converge as e —)■ 0 since the right hand side of equation (2.4) converges to 


w{ti) 


T 



a 

w{tM) 


1 

w'{ti) 


1 

/3 

_w'{tM)_ 


1 


We denote the limiting function by 

M M 

Q{t) = ^ aiKp{t, ti) + l3AKp{t, U). 
2=1 2=1 


(2.7) 
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We conclude that w{t) — Q{t) does not change sign at the U since w(t) — Qe{i) changes sign only at ti ± e. 

We now show that the limiting process does not introduce any additional zeros of w{t) — Q{t). Suppose 
Q{t) does touch w{t) at some ti G [—T, T] with ti ^ ti for any i = 1, M. Since w{t) — Q{t) does not change 
sign, the points ti,..., tM, Ti are non-nodal zeros of w(t) — Q{t). We find a contradiction by constructing a 
polynomial with two nodal zeros in the vicinity of each of these M + 1 points (but possibly only one nodal 
zero in the vicinity of ti if ti = T or ri = —T). 

For sufficiently small 7 > 0, the polynomial 

W^{t) = Q{t) + 

attains the value w{t) twice in the vicinity of each ti and twice in the vicinity of ri. In other words there 
exist Pi < ... < P 2 M +2 such that Wy{pi) = w{pi). Therefore 

Q{Pi) = - l)'w{pi) for i = l,...,2M + 2, 

and so ~ 3 (p2m+i) = 0 for i = 1, 2, 2M. Thus, the coefficient vector for the polynomial Q{t) lies in 
the left nullspace of the matrix 


(k(pi) - k(P2M+i) h{p2m) - k{p2m+i)) ■ 

However, this matrix is nonsingular by Lemma and the Determinantal condition. 

Collecting our results, we have proven that Q{t) — w{t) = 0 if and only if t = ti and that Q{t) — w{t) 
does not change sign when t passes through ti. Therefore one of the following is true 


w(t) > Q{t) or Q{t) > w(t) 


with equality iff t = ti. In the first case, Q{t) 


Q{t) fulfills the prescriptions (2.1) with 


M 

q(t) = + /?» 

i=l 


d 

dti 


ipis.ti). 


In the second case, Q{t) = 2w{t) — Q{t) satisfies (2.11 with 

M 


= 2 - ^aii){s,ti) + 


2.4 Proof of Theorem 11.31 

Integrability and Positivity naturally hold for the Gaussian pointspread function 'ip{s,t) = 
Independence holds because '0(s, ti), ..., tM) together with their derivatives d 2 'ip{s, ti),..., d2'4>{s, tj^) 
form a T-system (see for example |35]). This means that for any si < ... < S 2 m G Rj 

\v{si) . ■.v{S2m)\ 7 ^ 0, 

and the determinant always takes the same sign. Therefore, by an integral version of the Cauchy-Binet 
formula for the determinant (cf. |87p. 


J v{.s)v{s)'^dP{s) ={ 2 M)\ j |u(si)...u(s 2 m)| 


Si<...<S2M 


v(siy 


’[s2My 


dP{si)... dP{s 2 M) 7 ^ 0. 


To establish the Determinantal condition, we prove the slightly stronger statement: 


|A(pi,...,P2M+i)| 


u(s) 

1 


w(pi) 


^(s,p2M + l) 
w(p2M + l) 


dP(s) 


7^0 


( 2 . 8 ) 


for any distinct pi, ■ ■ ■ ,p 2 M+i- When pi,... ,P 2 M+i are restricted so that two points Pi,Pj lie in each ball 
(tk — p,tk + p), we recover the statement of Determinantal. 

We prove (2.8) with the following key lemma. 
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Lemma 2.5. For any si < ... < S 2 M +1 o,nd ti < ... < Im, 


g-(si-ti)^ 

-(si 

g —(si— 

-(si 

1 


g—(-S2M + 1—* 1 )^ 

g—(S2M+1— 

-(S2M+1 

1 


^0. 


Before proving this lemma, we show how it can be used to prove (2.8). By Lemma 2.5 we know in 
particular that for any si < • • • < S2M+1, 


det 


v{si) 

1 


v{s2M+i) 

1 




and is always the same sign. Moreover, for any si < • • • < S2M+1, and any pi < ... < P2M+1, 

V'(si,pi) ... 'lp{si,P2M+l) 


det 


.'0(s2M+l)Pl) 1 /’(s2M+1,P2M+i). 


> 0 . 


Any function with this property is called totally positive and it is well known that the Gaussian kernel totally 
positive | 38 ]. Now, to show that Determinantal holds for the finite sampling measure P, we use an integral 
version of the Cauchy-Binet formula for the determinant: 


v{s) 

1 


•0(8,Pl) 

w{pi) 


= (2M + 1)! 


‘>P{s,p 2 M + l) 

‘w(P2M + l) 


v{si) 

1 


dP{s) 


Sl<-"<S2M+1 


v{s 2 M+l) 

1 


'i/'(8l,Pl) 

tu(pi) 


1p{s2M + l,Pl) 

tu(pi) 


1p{si,P2M + l) 
Kj(P2M+l) 


'ti{s2M + l ,P2M + l) 

w{p2M+l) 


dP{si). ..dP{s 2 M+l)- 
(2.9) 


The integral is nonzero since all integrands are nonzero and have the same sign. This proves (2.8). 


Proof of Lemma |^.5[ Multiplying the 2z — 1 and 2*-th row by e** and the i-th column by e®*, and subtracting 
ti times the 2 i — 1-th row from the 2*-th row, we obtain that we equivalently have to show that 


gSltl 

gS2tl 

gS2M + lil 


S2e®2*i . 

S2M+ie'''=‘i 

gSlt2 

gS2i2 

gS2M + li2 

gSl^M 

^S2tM 

g-S2M + l^M 

g^gSltM 


■ S2M+ie^"“+i*“ 

^2 

^2 

_2 

e"i 

6 2 

g*2AT + l 


7^0. 


The above matrix has a vanishing determinant if and only if there exists a nonzero vector 

(oi, 61, ..., Om, 6m, ClM+l) 

in its left null space. This vector has to have nonzero last coordinate since by Example 1.1.5. in [38], the 
Gaussian kernel is extended totally positive and therefore the upper 2M x 2M submatrix has a nonzero 
determinant. Therefore, we assume that om+i = 1- Thus, the matrix above has a vanishing determinant if 
and only if the function 

M 


y^jai + -I- e'" 
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has at least the 2M + 1 zeros si < S 2 < 
establishes that this is impossible. To complete the proof of Lemma 2.5 
Lemma 12.61 


< S 2 M+ 1 - Lemma [2.6| applied to r = M and di = ■ ■ ■ = cIm = !> 

it remains to state and prove 


□ 


Lemma 2.6. Let di, ...,dr G N. The function 

r 

+ OiiS + • • • + ai(2di-i)'S^'^’ 

has at most 2{di + • • • + dr) zeros. 

Proof. We are going to show that 4>di,...,dr.{s) has at most 2{di + • • • + dr) zeros as follows. Let 

9o(.^) ~ 4*di,....dri^) • 


For k = 1, ..., di + ■ ■ ■ + dr, let 

[5fc-i(s)e("b+«i+ '+L-i)s] ^ if fc = di +- h dj-i + 1 for some j, 

\^[5fc-i(s)], otherwise. 


( 2 . 10 ) 


If we show that _i-dr('S) has no zeros, then, gu^^.^ _|_d,.-i(s) has at most two zeros has at most two zeros, 

counting with multiplicity. By induction, it will follow that go{s) has at most 2(di + • • • +dr) zeros, counting 
with multiplicity. Note that if di + • • • + dj_i < k < di + ■ ■ ■ + dj-i + dj, then 


—(dj_2(fe-di+...+dj_i) H-+ ^ 

r 

+ ^ (hio + • • • + hi(2di-i)S^'^’ + c/i(r)e'’ 

i=i + l 


where c > 0 is a constant and r := s — Ci. We are going to show that fi{r) is a sum of squares polynomial 
such that one of the squares is a positive constant. This would mean that gkis) = fk{s)e’^ has no zeros. 
Denote 


Po{s) = 1 

Pi(s) = 2s 


Pi{s) = 2api-i{s - Ci) +p'_i(s - Ci), 


where ci,..., Ck are constants. It follows by induction that the degree of Pi{s) is deg(pi) = i and the leading 
coefficient of pi(s) is 2L 

We will show by induction that 

h{s) = p,{s)^ + + • • • + 




1=0 


When i = 0, we have that /o(s) = 1 and going to prove the general 

statement by induction. Suppose the statement is true for i — 1. By the relationship (2.101, we have 




1=0 


ds"^ 

= E - Ci)Pi-l(s - Ci) + - cy 


( 2 . 11 ) 


J=0 
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+ (4s^ + 2)p|i\(s - af + 8sp|i\(s - Ci)p^/_'[^\s - Ci)| 


We need to show that this expression is equal to Since 


j —0 2 ^j! 

Pi{a) = 2spi-i{s - Ci) +p-_i(s - Cj), 

it follows by induction that p\^\s) = 2jp[^Si^\s — Ci) + 2sp^\[s — Ci) + p^^i\s — Ci). Therefore we obtain 


i {j) ( \2 * 1 

^ ^ ^ (s - c*) + 2spli\(s - Ci) + (s - c*) 


n 2 


i=o 


i=o 


2^ j! 1 


J=0 


=6'" (2.12) 


+ ^jsp^i_-^^\s - Ci)p^il-^{s - c,)+ 

+ 4sp,^i\(s - Ci)p,^i+^^ + ^jp^iSi\s - - Ci) 

There are four types of terms in the sums ( 2.11[ ) and ( |2.12 1: 

Pii\(s-c*)^ s^p,^i\(s - Ci)^ p^iSi\s-c^)p^^\{s-a), and - c,)p-i\(s - Ci). 


For a fixed j S {0,1,..., i + 1}, it is easy to check that the coefficients in front of each of these terms in (2.111 
and (2.121 are equal. Therefore, 

f^{s) = p,isf + ^Piisf + • • • + 

' 1 0■)^„^2 




1=0 


Note that since deg(pi) = i, then, pf\s) equals the leading coefficient of Pi{s), which, as we discussed above, 
equals 2b Therefore, the term 2 ^Pi*^(s)^ = 2*d. Thus, one of the squares in /i(s) is a positive number, so 
/i(s) > 0 for all s. □ 


3 Numerical experiments 

In this section we present the results of several numerical experiments to complement our theoretical results. 
To allow for potentially noisy observations, we solve the constrained least squares problem 


minimize 

M>0 


'<p{s^,t)dp{t) - x{si) 

i=l 


(3.1) 


subject to / w(t)p{dt) < r 


using the conditional gradient method proposed in m- 


3.1 Reweighing matters for source localization 

Our first numerical experiment provides evidence that weighting by w{t) helps recover point sources near the 
border of the image. This matches our intuition: near the border, the mass of an observed point-source is 
smaller than if it were measured in the center of the image. Hence, if we didn’t weight the candidate locations, 
sources that are very close to the edge of the image would be beneficial to add to the representation. 
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We simulate two populations of images, one with point sources located away from the image boundary, 
and one with point sources located near the image boundary. For each population of images, we solve 
with w{t) = J '4’{s,t)dP{s) (weighted) and with w(t) = 1 (unweighted). We find that the solutions to 
recover the true point sources more accurately with w{t) = f 'ijj{s,t)dP{s). 

We use the same procedure for computing accuracy as in |48| . Namely we match true point sources to 
estimated point courses and compute the F-score of the match. To describe this procedure in detail, we 
compute the F-score by solving a bipartite graph matching problem. In particular, we form the bipartite 
graph with an edge between ti and tj for all i,j such that ||ti —ij|| < r, where r > 0 is a tolerance parameter, 
and ii,... are the estimated point sources. Then we greedily select edges from this graph under the 
constraint that no two selected edges can share the same vertex; that is, no ti can be paired with two 
or vice versa. Finally, the ti successfully paired with some tj are categorized as true positives, and we denote 
their number by Tp. The number of false negatives is Fn = M — Tp, and the number of false positives is 
N — Tp. The precision and recall are then P = pj"pp^ , and R = p^^p^ respectively, and the F-score is the 
harmonic mean: 

F=^ 

P + R' 

We find a match by greedily pairing points of {ri,..., r^} to elements of {ti, ..., ^m}) and a tolerance radius 
r > 0 upper bounds the allow distance between any potential pairs. To emphasize the dependence on r, we 
sometimes write F{r) for the F-score. 

Both populations contain 100 images simulated using the Gaussian point spread function 

^^(s, t) = e 


(O 

(3.1 


with a = 0.1, and in both cases, the measurement set 5 is a dense uniform grid of n = 100 points covering 
[0,1]. The populations differ in how the point sources for each image are chosen. Each image in the first 
population has five points drawn uniformly in the interval (.1, .9), while each image in the second population 
has a total of four point sources with two point sources in each of the two boundary regions (0, .1) and (.9,1). 
In both cases we assign intensity of 1 to all point sources, and solve (3.11 using an optimal value of r (chosen 
with a preliminary simulation). 

The results are displayed in Figure The left subplot shows that the F-scores are essentially the same 
for the weighted and unweighted problems when the point sources are away from the boundary. This is not 
surprising because when t is away from the border of the image, then f ip{s, t)dP{s) is essentially a constant, 
independent of t. But when the point sources are near the boundary, the weighting matters and the F-scores 
are dramatically better. 


3.2 Sensitivity to point-source separation 


Our theoretical results assert that in the absence of noise the optimal solution of (1.3) recovers point sources 


with no minimum bound on the separation. In the following experiment, we explore the ability of (3.1) to 


recover pairs of points as a function of their separation. The setup is similar to the first numerical experiment. 
We use the Gaussian point spread function with cr = 0.1 as before, but here we observe only n = 50 samples. 
For each separation d G {.Icr, .2cr,..., 1.9tT, 2a}, we simulate a population of 20 images containing two point 
sources separated by d. The point sources are chosen by picking a random point x away from the border of 
the image and placing two point sources at xF^. Again, each point source is assigned an intensity of 1, and 


we attempt to recover the locations of the point sources by solving (3.1). 


In the left subplot of Figure we plot F-score versus separation for the value of r that produces the best 
F-scores. Note that we achieve near perfect recovery for separations greater than The right subplot of 
Figureshows the observations, true point sources, and estimated point sources for a separation of ^ = j. 
Note the near perfect recovery in spite of the small separation. 

Due to numerical issues, we cannot localize point sources with arbitrarily small d > 0. Indeed, the F-score 
for ^ < I is quite poor. This does not contradict our theory because numerical ill-conditioning is in effect 
adding noise to the recovery problem, and we expect that a separation condition will be necessary in the 
presence of noise. 
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Performance away from boundary 


Performance near boundary 



r 



-Weighted 

- Unweighted 


Figure 5: Reweighing matters for source localization. The two plots above compare the quality of 
solutions to the weighted problem (with w{t) = f t)dP{s)) and the unweighted problem (with w{t) = 1). 
When point sources are away from the boundary (left plot), the performance is nearly identical. But when 
the point sources are near the boundary (right plot), the weighted method performs significantly better. 


3.3 Sensitivity to noise 


Next, we investigate the performance of (3.1) in the presence of additive noise. The setup is identical to the 


previous numerical experiment, except that we add Gaussian noise to the observations. In particular, our 
noisy observations are 

{x(si) +r]i\siGS} 

where rji ^ ^(0, 0.1). 

We measure the performance of (3.1) in Figure]^ Note that we achieve near-perfect recovery when 
d > a. However, if d < cr the F-scores are clearly worse than the noiseless case. Unsurprisingly, we observe 
that sources must be separated in order to recover their locations to reasonable precision. We defer an 
investigation of the dependence of the signal separation as a function of the signal-to-noise ratio to future 
work. 


3.4 Extension to two-dimensions 

Though our proof does not extend as is, we do expect generalizations of our recovery result to higher 


dimensional settings. The optimization problem (3.1) extends immediately to arbitrary dimensions, and we 
have observed that it performs quite well in practice. We demonstrate in Figure|^the power of applying (3.1) 
to a high density fluorescence image in simulation. Figure shows an image simulated with parameters 
specified by the Single Molecule Localization Microscopy challenge [33]. In this challenge, point sources are 
blurred by a Gaussian point-spread function and then corrupted by noise. The green stars show the true 
locations of a simulated collection of point sources, and the red dots show the support of the measure output 
by (3.1) applied to the greyscale image forming the background of Figure]^ The overlap between the true 
locations and estimated locations is near perfect with an F-score of 0.98 for a tolerance radius corresponding 
to one third of a pixel. 
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Performance vs. normalized separation Recovery for f ^ 




d 

a 


Figure 6: Sensitivity to point-source separation, (a) The F-score at tolerance radius r = 0.1 as a 
function of normalized separation ^. (b) The black trace shows an image for ^ ^. The green stars show 

the locations (x-coordinate) and weights (y-coordinate) of the true point sources. The red dots show the 
recovered locations and weights. 


Performance vs. normalized separation Recovery for ^ = 1.2 




Figure 7: Sensitivity to noise, (a) The F-score at tolerance radius r = 0.1 as a function of normalized 
separation (b) The black trace is the 50 pixel image we observe. The green stars show the locations (x- 
coordinate) and weights (y-coordinate) of the true point sources. The red dots show the recovered locations 
and weights. 
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Figure 8: High density single molecule imaging. The green stars show the locations of a simulated 
collection point sources, and the greyscale background shows the noisy, pixelated point spread image. The 
red dots show the support of the measure-valued solution of (3.1). 
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4 Conclusions and Future Work 


In this paper we have demonstrated that one can recover the centers of a nonnegative sum of Gaussians 
from a few samples by solving a convex optimization problem. This recovery is theoretically possible no 
matter how close the true centers are to one-another. We remark that similar results are true for recovering 
measures from their moments. Indeed, the atoms of a positive atomic measure can be recovered no matter 
how close together the atoms are, provided one observes twice the number of moments as there are atoms. 
Our work can be seen as a generalization of this result, applying generalized polynomials and the theory of 
Tchebycheff systems in place of properties of Vandermonde systems. 

As we discussed in our numerical experiments, this work opens up several theoretical problems that would 
benefit from future investigation. We close with a very brief discussion of some of the possible extensions. 

Noise Motivated by the fact that there is no separation condition in the absence of noise, it would be 
interesting to study how the required separation decays to zero as the noise level decreases. One of the 
key-advantages of using convex optimization for signal processing is that dual certificates generically give 
stability results, in the same way that Lagrange multipliers measure sensitivity in linear programming. 
Previous work on estimating line-spectra has shown that dual polynomials constructed for noiseless recovery 
extend to certify properties of estimation and localization in the presence of noise We believe 

that these methods should be directly applicable to our problem set-up. 

Higher dimensions One logical extension is proving that the same results hold in higher dimensions. 
Most scientific and engineering applications of interest have point sources arising one to four dimensions, 
and we expect that some version of our results should hold in higher dimensions. Indeed, we believe a 
guarantee for recovery with no separation condition can be proven in higher dimensions with noiseless 
observations. However, it is not straightforward to extend our results to higher dimensions because the 
theory of Tchebycheff systems is only developed in one dimension. In particular, our approach using limits 
of polynomials does not directly generalize to higher dimensions. 

Other point spread functions We have shown that our Conditions o hold for the Gaussian point 
spread function, which is commonly used in microscopy as an approximation to an Airy function. It will be 
very useful to show that they also hold for other point spread functions such as the Airy function and other 
common physical models. Our proof relied heavily on algebraic properties of the Gaussian, but there is a 
long, rich history of determinantal systems that may apply to generalize our result. In particular, works on 
properties of totally positive systems may be fruitful for such generalizations [11113]. 

Model mismatch in the point spread function Our analysis relies on perfect knowledge of the point 
spread function. In practice one never has an exact analytic expression for the point spread function. 
Aberrations in manufacturing and scattering media can lead to distortions in the image not properly captured 
by a forward model. It would be interesting to derive guarantees on recovery that assume only partial 
knowledge of the point spread function. Note that the optimization problem of searching both for the 
locations of the sources and for the associated wave-function is a blind deconvolution problem, and techniques 
from this well-studied problem could likely be extended to the super-resolution setting. If successful, such 
methods could have immediate practical impact when applied to denoising images in molecular, cellular, and 
astronomical imaging. 
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